Abstract: Collaborative intrusion detection networks are often 
used to gain better detection accuracy and cost efficiency as 
compared to a single host-based intrusion detection system (IDS). 
Through cooperation, it is possible for a local IDS to detect new 
attacks that may be known to other experienced acquaintances. 
In this paper, we present a sequential hypothesis testing method 
for feedback aggregation for each individual IDS in the net- 
work. Our simulation results corroborate our theoretical results 
and demonstrate the properties of cost efficiency and accuracy 
compared to other heuristic methods. The analytical result on 
the lower-bound of the average number of acquaintances for 
consultation is essential for the design and configuration of IDSs 
in a collaborative environment. 



I. Introduction 

As computer systems become increasingly complex, the accompanying potential threats also grow more sophisticated. Intrusion detection is the process of monitoring and identifying attempted unauthorized system access or manipulation. Along with firewalls, it is one of the most important tools available to a network administrator for detecting security breaches.

An IDS can be categorized as either host-based or network- 
based. A host-based IDS (HIDS) is intended primarily to 
monitor a host, which can be a server, workstation, or any 
networked device, whereas a network-based IDS (NIDS) is 
used to protect a group of computer hosts by capturing and 
analyzing network packets. Even though these two types of 
IDSs are commonly employed in an enterprise network, they 
do not adequately leverage the possible information exchange 
between IDSs. The exchange of alert data or decisions be- 
tween administrative domains can effectively supplement the 
knowledge gained by a single local IDS. In a collaborative 
environment, an IDS can learn the global state of network 
attack patterns from its peers. By augmenting the information 
gathered from across the network, an IDS can have a more 
precise picture of an attacker's behavior and hence increase 
its accuracy and efficiency of detection. 

Collaborative intrusion detection networks (CIDNs) have features that distinguish them from other types of social networks, such as P2P networks and e-commerce networks, where collaboration follows a one-time or short-term pattern. Collaboration in a CIDN is usually long-term. Unlike in other social networks, communication in CIDNs is often low-cost, which makes it possible to use test messages (communication overhead generated on purpose to test the reliability of the collaborators).

Based on the aforementioned properties, we design a CIDN 
which utilizes test messages to learn the reliability of others 
and consultation requests to seek diagnosis from collaborators. 
The architecture design is shown in Figure 1, where NIDSs 
and HIDSs are connected into a collaboration network. Each 
IDS maintains a list of acquaintances (collaborators) and test 
messages are sent to acquaintances periodically to update its 
belief on peer reliability. When an IDS receives intrusion 
alerts and lacks confidence to determine the nature of the 
alerted source, alert messages are sent to its acquaintances 
for evaluation. An acquaintance IDS analyzes the received 
intrusion information and replies with a feedback of posi- 
tive/negative diagnosis. The ambivalent IDS collects feedback 
from its acquaintances and decides whether an alarm should 
be raised or not to the administrator. If an alarm is raised, the 
suspicious intrusion flow will be suspended and the system 
administrator investigates the intrusion immediately. 

In this paper, we design an efficient distributed sequential 
algorithm for IDSs to make decisions based on the feedback 
from their collaborators. We investigate four possible outcomes 
of a decision: false positive (FP), false negative (FN), true 
positive (TP), and true negative (TN). Each outcome is asso- 
ciated with a cost. Our proposed sequential hypothesis testing 
based feedback aggregation provides improved cost efficiency 
as compared to other heuristic methods, such as the simple 
average model [1] and the weighted average model [2], [3]. In 
addition, the algorithm reduces the communication overhead 
as it aggregates feedback until a predefined FP and TP goal is 
reached. Our analytical model effectively estimates the number 
of acquaintances needed for an IDS to reach its predefined 
intrusion detection goal. Such a result is crucial to the design of 
an IDS acquaintance list in a CIDN. 

The remainder of this paper is organized as follows. In 
Section II, we review some existing CIDNs in the literature 
and IDS feedback aggregation techniques. The problem for- 
mulation is in Section III, where we use hypothesis testing 
to minimize the cost of decisions and sequential hypothesis 
testing to form a consultation termination policy for predefined 
goals. In Section IV, we use a simulation approach to evaluate 
the effectiveness of our aggregation system and validate the 
analytical model. Section V concludes the paper and identifies 
directions for future research. 

II. Related Work 

Many CIDNs have been proposed in the literature, such as Indra [4], DOMINO [5], and NetShield [6]. However, these works did not address the problem that the system might be degraded by compromised insiders who are dishonest or malicious.

Fig. 1. A Collaborative Intrusion Detection Network

Simple majority voting [7] and trust management are commonly used to detect malicious insiders in CIDNs. Existing trust management models for CIDNs are either linear, as in [2], [8], or Bayesian, as in [3]. They are based on heuristics in which the feedback aggregation is either a simple average [1] or a weighted average [3]. Moreover, no decision cost is considered in these models. In this paper, we use a sequential hypothesis testing model that aims to find cost-minimizing decisions based on collected feedback. Existing work that applies hypothesis testing to intrusion detection includes [9] and [10], where a central data fusion center is used to aggregate results from distributed sensors in a local area network. However, their methodologies are limited to the setting in which all participants engage in every detection case. In our context, by contrast, an IDS may not be involved in every intrusion detection, and the collected responses may come from a different group of IDSs each time.

III. Problem Formulation 

In this section, we formulate the feedback aggregation as a sequential hypothesis testing problem. Consider a set of $N$ nodes, $\mathcal{N}$, connected in a network, which can be represented by a graph $\mathcal{G} = (\mathcal{N}, \mathcal{E})$. The set $\mathcal{E}$ contains the undirected links between nodes, indicating the acquaintances of IDSs in the network.

Let $Y_i$, $i \in \mathcal{N}$, be a random variable denoting the decision of IDS $i$ observed by its peer IDSs on its acquaintance list $\mathcal{N}_i$. The random variable takes values in $\mathcal{Y}_i = \{0, 1\}$. In the intrusion detection setting, $Y_i = 0$ says that IDS $i$ decides that there is no intrusion, while $Y_i = 1$ means that IDS $i$ raises an alarm of possible detection of intrusion. Each IDS makes its decision based upon its own experience of previous attacks and its own sophistication of detection. We let $p_i$ be the probability mass function defined on $\mathcal{Y}_i$ such that $p_i(Y_i = 0)$ and $p_i(Y_i = 1)$ denote the probability of no intrusion and the probability of intrusion as reported by IDS $i$, respectively.

We let $\mathbf{Y}^i := [Y_j]_{j \in \mathcal{N}_i} \in \mathcal{Y}^i := \prod_{j \in \mathcal{N}_i} \mathcal{Y}_j$ be an observation vector of an IDS $i$ that contains the feedback from its peers in the acquaintance list. Each IDS has two hypotheses, $H_0$ and $H_1$: $H_0$ hypothesizes that no intrusion is detected, whereas $H_1$ hypothesizes that an intrusion is detected and an alarm needs to be raised. Note that we intentionally drop the superscript $i$ because we assume that each IDS attempts to make the same decision. Denote by $\pi_0^i, \pi_1^i$ the a priori probabilities of the hypotheses, such that $\pi_0^i = P[H_0]$, $\pi_1^i = P[H_1]$, and $\pi_0^i + \pi_1^i = 1$ for all $i \in \mathcal{N}$. The conditional probability $p^i(\mathbf{Y}^i = \mathbf{y}^i \mid H_l)$, $l = 0, 1$, denotes the probability of a complete feedback being $\mathbf{y}^i \in \prod_{j \in \mathcal{N}_i} \mathcal{Y}_j$ given the hypothesis. Assuming peers make decisions independently (this is reasonable if acquaintances are appropriately selected), we can rewrite the conditional probability as

$$p^i(\mathbf{Y}^i = \mathbf{y}^i \mid H_l) = \prod_{j \in \mathcal{N}_i} p_j(Y_j = y_j \mid H_l), \quad i \in \mathcal{N},\ l = 0, 1. \tag{1}$$

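A hypothesis testing problem is one of finding a decision function $\delta^i(\mathbf{Y}^i) : \mathcal{Y}^i \to \{0, 1\}$ to partition the observation space $\mathcal{Y}^i$ into two disjoint sets $\mathcal{Y}^i_0$ and $\mathcal{Y}^i_1$, where $\mathcal{Y}^i_0 = \{\mathbf{y}^i : \delta^i(\mathbf{y}^i) = 0\}$ and $\mathcal{Y}^i_1 = \{\mathbf{y}^i : \delta^i(\mathbf{y}^i) = 1\}$.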

To find an optimal decision function according to some criterion, we introduce the cost function $C^i_{ll'}$, $l, l' = 0, 1$, which represents IDS $i$'s cost of deciding that $H_l$ is true when $H_{l'}$ holds. More specifically, $C^i_{01}$ is the cost associated with a missed intrusion or attack and $C^i_{10}$ refers to the cost of a false alarm, while $C^i_{00}, C^i_{11}$ are the costs incurred when the decision matches the true situation. In several situations, it can be shown that the decision function can be chosen as a function of the likelihood ratio $L^i(\mathbf{y}^i) = \frac{p^i(\mathbf{y}^i \mid H_1)}{p^i(\mathbf{y}^i \mid H_0)}$ (see [9], [10]).

A threshold Bayesian decision rule is expressed in terms of the likelihood ratio and is given by

$$\delta^i(\mathbf{y}^i) = \begin{cases} 1 & \text{if } L^i(\mathbf{y}^i) \ge \tau^i, \\ 0 & \text{if } L^i(\mathbf{y}^i) < \tau^i, \end{cases} \tag{2}$$

where the threshold $\tau^i$ is defined by

$$\tau^i = \frac{(C^i_{10} - C^i_{00})\,\pi^i_0}{(C^i_{01} - C^i_{11})\,\pi^i_1}. \tag{3}$$

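If the costs are symmetric and the two hypotheses are equally likely, then the rule in (2) reduces to the maximum likelihood (ML) decision rule

$$\delta^i_{ML}(\mathbf{y}^i) = \begin{cases} 1 & \text{if } p^i(\mathbf{y}^i \mid H_1) \ge p^i(\mathbf{y}^i \mid H_0), \\ 0 & \text{if } p^i(\mathbf{y}^i \mid H_1) < p^i(\mathbf{y}^i \mid H_0). \end{cases} \tag{4}$$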
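As an illustration, the likelihood ratio under the independence assumption in (1) and the threshold rule in (2) and (3) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in our experiments: the function names are hypothetical, and each peer $j$ is assumed to be summarized by two numbers, an estimated detection probability $P(y_j = 1 \mid H_1)$ and an estimated false alarm probability $P(y_j = 1 \mid H_0)$, both strictly between 0 and 1.

```python
from math import prod


def likelihood_ratio(feedback, p_detect, p_false_alarm):
    """Likelihood ratio L(y) = prod_j p_j(y_j | H1) / p_j(y_j | H0).

    feedback[j] is the binary answer y_j of peer j. p_detect[j] and
    p_false_alarm[j] are assumed estimates of P(y_j = 1 | H1) and
    P(y_j = 1 | H0); both must lie strictly between 0 and 1.
    """
    factors = []
    for y, pd, pf in zip(feedback, p_detect, p_false_alarm):
        if y == 1:
            factors.append(pd / pf)                   # P(1 | H1) / P(1 | H0)
        else:
            factors.append((1.0 - pd) / (1.0 - pf))   # P(0 | H1) / P(0 | H0)
    return prod(factors)


def bayes_threshold(c10, c00, c01, c11, pi0):
    """Threshold tau of (3), using pi1 = 1 - pi0."""
    return (c10 - c00) * pi0 / ((c01 - c11) * (1.0 - pi0))


def bayes_decision(feedback, p_detect, p_false_alarm, tau):
    """Rule (2): return 1 (raise an alarm) if L >= tau, otherwise 0."""
    ratio = likelihood_ratio(feedback, p_detect, p_false_alarm)
    return 1 if ratio >= tau else 0
```

With symmetric costs ($C_{10} = C_{01} = 1$, $C_{00} = C_{11} = 0$) and $\pi_0 = 0.5$, `bayes_threshold` returns 1, so `bayes_decision` coincides with the ML rule in (4).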



A. Sequential Hypothesis Testing 

In this subsection, we use sequential hypothesis testing to make decisions with a minimum number of feedback from the peer IDSs [11], [12]. An IDS asks for feedback from its acquaintance list until a sufficient number of answers are collected. Let $\Omega^i$ denote all the possible collections of feedback in the acquaintance list of an IDS $i$, and let $\omega^i \in \Omega^i$ denote a particular collection of feedback. Let $N^i(\omega^i)$ be a random variable denoting the number of feedback used until a decision is made. A sequential decision rule is formed by a pair $(\phi^i, \delta^i)$, where $\phi^i = \{\phi^i_n, n \in \mathbb{N}\}$ is a stopping rule and $\delta^i = \{\delta^i_n, n \in \mathbb{N}\}$ is the terminal decision rule. Introduce a stopping rule with $n$ feedback, $\phi^i_n : \mathcal{Y}^i_n := \prod_{j \in \mathcal{N}_{i,n}} \mathcal{Y}_j \to \{0, 1\}$, where $\mathcal{N}_{i,n}$ is the set of nodes that IDS $i$ has asked up to time $n$. $\phi^i_n = 0$ indicates that IDS $i$ needs to take more samples after $n$ rounds, whereas $\phi^i_n = 1$ means that it stops asking for feedback and a decision can be made by the rule $\delta^i_n$. The minimum number of feedback is given by

$$N^i(\omega^i) = \min\{n : \phi^i_n = 1,\ n \in \mathbb{N}\}. \tag{5}$$



Note that $N^i(\omega^i)$ is the stopping time of the decision rule. The decision rule $\delta^i$ is not used until $N$. We assume that no cost is incurred when a correct decision is made, while the cost of a missed intrusion is denoted by $C^i_M$ and the cost of a false alarm is denoted by $C^i_F$. In addition, we assume each feedback incurs a cost $D^i$. We introduce an optimal sequential rule that minimizes the Bayes risk given by

$$R^i(\phi^i, \delta^i) = R(\phi^i, \delta^i \mid H_0)\,\pi^i_0 + R(\phi^i, \delta^i \mid H_1)\,\pi^i_1, \tag{6}$$

where $R(\phi^i, \delta^i \mid H_l)$, $l = 0, 1$, are the Bayes risks under hypotheses $H_0$ and $H_1$, respectively:

$$R(\phi^i, \delta^i \mid H_0) = C^i_F\, P[\delta_N(Y_j, j \in \mathcal{N}_{i,N}) = 1 \mid H_0] + D^i E[N \mid H_0],$$
$$R(\phi^i, \delta^i \mid H_1) = C^i_M\, P[\delta_N(Y_j, j \in \mathcal{N}_{i,N}) = 0 \mid H_1] + D^i E[N \mid H_1].$$

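Let $V^i(\pi^i_0) = \min_{\phi^i, \delta^i} R^i(\phi^i, \delta^i)$ be the optimal value function. It is clear that when no feedback is obtained from the peers, the Bayes risks reduce to

$$R^i(\phi^i_0 = 1, \delta^i_0 = 1) = C^i_F\, \pi^i_0, \tag{7}$$
$$R^i(\phi^i_0 = 1, \delta^i_0 = 0) = C^i_M\, \pi^i_1. \tag{8}$$

Hence, $H_1$ is chosen when $C^i_F \pi^i_0 < C^i_M \pi^i_1$, or equivalently $\pi^i_0 < \frac{C^i_M}{C^i_F + C^i_M}$, and $H_0$ is chosen otherwise. The minimum Bayes risk under no feedback is thus obtained as a function of $\pi^i_0$ and is denoted by

$$T^i(\pi^i_0) = \begin{cases} C^i_F\, \pi^i_0 & \text{if } \pi^i_0 < \frac{C^i_M}{C^i_F + C^i_M}, \\ C^i_M (1 - \pi^i_0) & \text{otherwise.} \end{cases} \tag{9}$$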

The minimum cost function (9) is a piecewise linear function. For $\phi^i$ such that $\phi^i_0 = 0$, i.e., at least one feedback is obtained, let the minimum Bayes risk be denoted by $J^i(\pi^i_0) = \min_{\{(\phi^i, \delta^i) :\, \phi^i_0 = 0\}} R^i(\phi^i, \delta^i)$. Hence, the optimal Bayes risk needs to satisfy

$$V^i(\pi^i_0) = \min\{T^i(\pi^i_0), J^i(\pi^i_0)\}. \tag{10}$$

Note that $J^i(\pi^i_0)$ must be greater than the cost of one sample $D^i$, since a sample request incurs $D^i$, and that $J^i(\pi^i_0)$ is concave in $\pi^i_0$ as a consequence of minimizing the linear Bayes risk (6). If the cost $D^i$ is high enough so that $J^i(\pi^i_0) > T^i(\pi^i_0)$ for all $\pi^i_0$, then no feedback will be requested. In this case, $V^i(\pi^i_0) = T^i(\pi^i_0)$, and the terminal rule is described in (9). For other values of $D^i > 0$, due to the piecewise linearity of $T^i(\pi^i_0)$ and the concavity of $J^i(\pi^i_0)$, we can see that $J^i(\pi^i_0)$ and $T^i(\pi^i_0)$ have two intersection points $\pi^i_L$ and $\pi^i_H$ such that $\pi^i_L \le \pi^i_H$. It can be shown that for some reasonably low cost $D^i$ and $\pi^i_0$ such that $\pi^i_L < \pi^i_0 < \pi^i_H$, an IDS optimizes its risk by requesting another feedback; otherwise, an IDS should raise an alarm when $\pi^i_0 \le \pi^i_L$ and report no intrusion when $\pi^i_0 \ge \pi^i_H$.
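The no-feedback risk in (7) to (9) can be illustrated with a few lines of Python. This is only a sketch; the function names are hypothetical, and the costs $C_F$ and $C_M$ are assumed to be positive.

```python
def no_feedback_risk(pi0, cost_false_alarm, cost_miss):
    """Minimum Bayes risk T(pi0) of (9) when no peer is consulted."""
    risk_alarm = cost_false_alarm * pi0        # (7): decide H1, wrong with prob. pi0
    risk_no_alarm = cost_miss * (1.0 - pi0)    # (8): decide H0, wrong with prob. pi1
    return min(risk_alarm, risk_no_alarm)


def no_feedback_decision(pi0, cost_false_alarm, cost_miss):
    """Return 1 (choose H1) if pi0 < C_M / (C_F + C_M), otherwise 0."""
    return 1 if pi0 < cost_miss / (cost_false_alarm + cost_miss) else 0
```

Taking the minimum of the two linear terms gives the same piecewise linear function as (9), because the two lines cross exactly at $\pi_0 = C_M / (C_F + C_M)$.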

Assuming that it costs IDS $i$ the same amount $D^i$ to acquire each feedback, the problem has the same form after a feedback is obtained from a peer. IDS $i$ can use the feedback to update its a priori probability. After $n$ feedback are obtained, $\pi^i_0$ can be updated as follows:

$$\pi^i_0(n) = \frac{\pi^i_0}{\pi^i_0 + (1 - \pi^i_0)\, L^i_n}, \tag{11}$$

where $L^i_n := \prod_{j \in \mathcal{N}_{i,n}} \frac{p_j(y_j \mid H_1)}{p_j(y_j \mid H_0)}$. We can thus obtain the optimum Bayesian rule captured by Algorithm 1 below, known as the sequential probability ratio test (SPRT), for a reasonable cost $D^i$.

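Algorithm 1 SPRT Rule for an IDS $i$

Step 1: Start with $n = 0$. Use (12) as a stopping rule until $\phi^i_n = 1$ for some $n \ge 0$.

$$\phi^i_n = \begin{cases} 0 & \text{if } \pi^i_L < \pi^i_0(n) < \pi^i_H, \\ 1 & \text{otherwise,} \end{cases} \tag{12}$$

or, in terms of the likelihood ratio $L^i_n$, we can use

$$\phi^i_n = \begin{cases} 0 & \text{if } A^i < L^i_n < B^i, \\ 1 & \text{otherwise,} \end{cases}$$

where $A^i = \frac{\pi^i_0 (1 - \pi^i_H)}{(1 - \pi^i_0)\, \pi^i_H}$ and $B^i = \frac{\pi^i_0 (1 - \pi^i_L)}{(1 - \pi^i_0)\, \pi^i_L}$.

Step 2: Go to Step 3 if $\phi^i_n = 1$ or $n = |\mathcal{N}_i|$; otherwise, choose a new peer from the acquaintance list to request a diagnosis and go to Step 2 with $n = n + 1$.

Step 3: Apply the terminal decision rule as follows to determine whether there is an intrusion.

$$\delta^i_n = \begin{cases} 1 & \text{if } \pi^i_0(n) \le \pi^i_L, \\ 0 & \text{if } \pi^i_0(n) \ge \pi^i_H, \end{cases} \quad \text{or} \quad \delta^i_n = \begin{cases} 0 & \text{if } L^i_n \le A^i, \\ 1 & \text{if } L^i_n \ge B^i. \end{cases}$$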
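The likelihood-ratio form of Algorithm 1 can be sketched in Python as follows. This is a hedged illustration rather than the code used in our simulations: the function name `sprt` and the tuple layout of the feedback are assumptions, as is the choice to return `None` when the acquaintance list is exhausted before either threshold is crossed, a case that Algorithm 1 leaves to the terminal rule.

```python
def sprt(feedback, lower, upper):
    """Sequential probability ratio test in likelihood-ratio form.

    feedback: iterable of (y, p_detect, p_false_alarm) tuples in the order
        peers are consulted, where y is the binary answer of the peer and the
        two probabilities are assumed estimates of P(y = 1 | H1) and
        P(y = 1 | H0), both strictly between 0 and 1.
    lower, upper: thresholds A and B with A < B.

    Returns (decision, n): decision is 1 (alarm), 0 (no alarm), or None if
    all feedback was used without crossing a threshold (an assumption of
    this sketch); n is the number of peers consulted.
    """
    ratio = 1.0   # L_0 = 1 before any feedback is collected
    asked = 0
    if ratio >= upper:
        return 1, asked
    if ratio <= lower:
        return 0, asked
    for y, pd, pf in feedback:
        asked += 1
        # Multiply in this peer's likelihood-ratio factor.
        ratio *= pd / pf if y == 1 else (1.0 - pd) / (1.0 - pf)
        if ratio >= upper:    # stopping rule fires, decide H1
            return 1, asked
        if ratio <= lower:    # stopping rule fires, decide H0
            return 0, asked
    return None, asked
```

For example, with peers whose assumed detection and false alarm probabilities are 0.8 and 0.2, two consecutive positive answers push the ratio to 16, which exceeds an upper threshold of 9.5, so the test stops after two consultations.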



B. Prior Probabilities 

In the above section, the conditional probabilities $p_i(y_i \mid H_l)$, $i \in \mathcal{N}$, $l \in \{0, 1\}$, are assumed to be known. In this section, we use the beta distribution and its Gaussian approximation to find these probabilities. We let $p^i_M := p_i(y_i = 0 \mid H_1)$ be the probability of miss of IDS $i$'s diagnosis, also known as the false negative (FN) rate, and let $p^i_F := p_i(y_i = 1 \mid H_0)$ be the probability of false alarm, or false positive (FP) rate. The probability of detection, or true positive (TP) rate, can be expressed as $p^i_D = 1 - p^i_M$.

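Based on historical data, an IDS $j$ can assess the distributions over its peer IDS $i$'s probabilities of detection and false alarm as beta functions parameterized by $\alpha^i_F, \alpha^i_D$ and $\beta^i_F, \beta^i_D$:

$$p^i_F \sim \mathrm{Beta}(x_i \mid \alpha^i_F, \beta^i_F) = \frac{\Gamma(\alpha^i_F + \beta^i_F)}{\Gamma(\alpha^i_F)\,\Gamma(\beta^i_F)}\, x_i^{\alpha^i_F - 1} (1 - x_i)^{\beta^i_F - 1}, \tag{13}$$

$$p^i_D \sim \mathrm{Beta}(y_i \mid \alpha^i_D, \beta^i_D) = \frac{\Gamma(\alpha^i_D + \beta^i_D)}{\Gamma(\alpha^i_D)\,\Gamma(\beta^i_D)}\, y_i^{\alpha^i_D - 1} (1 - y_i)^{\beta^i_D - 1}, \tag{14}$$

where $x_i, y_i \in [0, 1]$, and $\alpha^i_F, \alpha^i_D$ and $\beta^i_F, \beta^i_D$ are beta function parameters that are updated according to historical data as follows.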



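$$\alpha^i_F = \sum_{k \in \mathcal{M}_0} (\lambda^i_F)^{t^i_k}\, r^i_{F,k}, \qquad \beta^i_F = \sum_{k \in \mathcal{M}_0} (\lambda^i_F)^{t^i_k}\, (1 - r^i_{F,k}); \tag{15}$$

$$\alpha^i_D = \sum_{k \in \mathcal{M}_1} (\lambda^i_D)^{t^i_k}\, r^i_{D,k}, \qquad \beta^i_D = \sum_{k \in \mathcal{M}_1} (\lambda^i_D)^{t^i_k}\, (1 - r^i_{D,k}). \tag{16}$$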



The introduction of the discount factors $\lambda^i_F, \lambda^i_D \in [0, 1]$ places more weight on recent data from IDS $i$ and less on older data. The discount factors on the data can be different for false negative and false positive rates. The parameter $t^i_k$ denotes the time when the $k$-th diagnosis data is generated (and sent to its peer) by IDS $i$. The parameters $r^i_{F,k}, r^i_{D,k} \in \{0, 1\}$ are the revealed results of the $k$-th diagnosis data: $r^i_{F,k} = 1$ suggests that the $k$-th diagnosis data from peer $i$ yields a false alarm, while $r^i_{F,k} = 0$ means otherwise; similarly, $r^i_{D,k} = 1$ indicates that the data from peer $i$ results in a correct detection under intrusion, and $r^i_{D,k} = 0$ suggests otherwise. The total reported diagnosis data form the set $\mathcal{M}$, which is classified into two groups: one where the result is either a false positive or a true negative under no intrusion, denoted by the set $\mathcal{M}_0$; and the other where the result is either a false negative or a true positive under intrusion, denoted by the set $\mathcal{M}_1$. The two sets are disjoint and satisfy $\mathcal{M}_0 \cup \mathcal{M}_1 = \mathcal{M}$ and $\mathcal{M}_0 \cap \mathcal{M}_1 = \emptyset$.

Each peer $j$ can assess a peer $i$ using (13) and (14), where we have not included the index $j$ in the expressions for simplicity. However, it is clear that (13) and (14) are assessed from the perspective of a certain IDS $j$. In addition, the discount factors in (15) and (16) need not be the same for all $j$. Hence, we can implicitly view (15) and (16) as dependent on $j$.

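When the parameters $\alpha$ and $\beta$ of the beta functions in (13) and (14) are sufficiently large, i.e., enough data are collected, the beta distribution can be approximated by a Gaussian distribution as

$$\mathrm{Beta}(\alpha, \beta) \approx \mathcal{N}\left(\frac{\alpha}{\alpha + \beta},\ \sqrt{\frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}}\right). \tag{17}$$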



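Note that we have dropped the superscripts and subscripts in (17) for generality, as it can be applied to all $i$ in (13) and (14). Hence, using the Gaussian approximation together with (15) and (16), the expected $p^i_D$ and $p^i_F$ are given by

$$E[p^i_F] = \frac{\alpha^i_F}{\alpha^i_F + \beta^i_F}, \qquad E[p^i_D] = \frac{\alpha^i_D}{\alpha^i_D + \beta^i_D}. \tag{18}$$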



The mean values in (18) under large data can be intuitively interpreted as the proportions of false alarm and detection results in the sets $\mathcal{M}_0$ and $\mathcal{M}_1$, respectively. They can thus be used in (1) as the assessment of the peer probability distribution $p_j$.
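The estimation procedure of this subsection can be sketched in Python as follows. It is a minimal sketch under explicit assumptions: the function names are hypothetical, each record is a binary revealed result, and the exponent of the discount factor is taken to be the age of the record (time elapsed since it was generated), so that a discount factor below 1 gives recent records more weight. The mean and standard deviation are the two parameters of the Gaussian approximation in (17).

```python
from math import sqrt


def discounted_beta_params(results, ages, discount):
    """Discounted beta parameters (alpha, beta) from binary revealed results.

    results[k] is 1 if the k-th record counts toward alpha (e.g. a false
    alarm when estimating p_F), else 0. ages[k] is the assumed age of the
    record; discount is a factor in (0, 1].
    """
    weights = [discount ** age for age in ages]
    alpha = sum(w * r for w, r in zip(weights, results))
    beta = sum(w * (1 - r) for w, r in zip(weights, results))
    return alpha, beta


def beta_mean(alpha, beta):
    """Mean of Beta(alpha, beta), used as the point estimate in (18)."""
    return alpha / (alpha + beta)


def beta_std(alpha, beta):
    """Standard deviation of Beta(alpha, beta), as in (17)."""
    total = alpha + beta
    return sqrt(alpha * beta / (total ** 2 * (total + 1.0)))
```

With a discount factor of 1 the mean reduces to the plain proportion of positive results, which matches the interpretation of (18) given above.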

C. Threshold Approximation 

In the sequential likelihood ratio test of Algorithm 1, the threshold values $A$ and $B$ need to be calculated by finding $\pi^i_L$ and $\pi^i_H$ from $J^i(\pi^i_0)$ and $T^i(\pi^i_0)$ in (10). The search for these values using dynamic programming can be quite involved. In this subsection, we instead introduce an approximation method to find the thresholds. The approximation is based on the theoretical studies in [11] and [12], where a random walk or martingale model is used to relate the thresholds to the false positive and false negative rates. Let $P^i_D, P^i_F$ be the probability of detection and the probability of false alarm of an IDS $i$ after applying sequential hypothesis testing for feedback aggregation. Note that these probabilities are different from the probabilities $p^i_D, p^i_F$ discussed in the previous subsection, which are the raw detection probabilities without feedback in the collaborative network. Let $\bar{P}^i_D$ and $\bar{P}^i_F$ be reasonable desired performance bounds such that $P^i_F \le \bar{P}^i_F$ and $P^i_D \ge \bar{P}^i_D$. Then, the thresholds can be chosen such that $A^i = \frac{1 - \bar{P}^i_D}{1 - \bar{P}^i_F}$ and $B^i = \frac{\bar{P}^i_D}{\bar{P}^i_F}$.

The next proposition gives a bound on the number of peers that need to be on the acquaintance list to achieve the desired performance.

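Proposition 3.1: Assume that each IDS makes independent diagnoses on its peers' requests and that each has the same distribution $p^i_0 = p_0 := p(\cdot \mid H_0)$, $p^i_1 = p_1 := p(\cdot \mid H_1)$, with $p_0(y_i = 0) = \theta_0$ and $p_1(y_i = 0) = \theta_1$, for all $i \in \mathcal{N}$.

Let $D_{KL}(p_0 \| p_1)$ be the Kullback-Leibler (KL) divergence defined as follows:

$$D_{KL}(p_0 \| p_1) = \sum_{k=0}^{1} p_0(k) \ln \frac{p_0(k)}{p_1(k)} \tag{19}$$

$$= \theta_0 \ln \frac{\theta_0}{\theta_1} + (1 - \theta_0) \ln \frac{1 - \theta_0}{1 - \theta_1}. \tag{20}$$

Likewise, the KL divergence $D_{KL}(p_1 \| p_0)$ can be defined. On average, an IDS needs $N_i$ acquaintances such that

$$N_i \ge \max\left( \left\lceil -\frac{D^i_M}{D_{KL}(p_0 \| p_1)} \right\rceil,\ \left\lceil \frac{D^i_F}{D_{KL}(p_1 \| p_0)} \right\rceil \right), \tag{21}$$

where $D^i_M = \bar{P}^i_F \ln \frac{\bar{P}^i_D}{\bar{P}^i_F} + (1 - \bar{P}^i_F) \ln \frac{1 - \bar{P}^i_D}{1 - \bar{P}^i_F}$ and $D^i_F = \bar{P}^i_D \ln \frac{\bar{P}^i_D}{\bar{P}^i_F} + (1 - \bar{P}^i_D) \ln \frac{1 - \bar{P}^i_D}{1 - \bar{P}^i_F}$. If $\bar{P}^i_F \ll 1$ and $\bar{P}^i_M := 1 - \bar{P}^i_D \ll 1$, we need approximately $N_i$ such that

$$N_i \ge \max\left( \left\lceil -\frac{\ln \bar{P}^i_M}{D_{KL}(p_0 \| p_1)} \right\rceil,\ \left\lceil -\frac{\ln \bar{P}^i_F}{D_{KL}(p_1 \| p_0)} \right\rceil \right). \tag{22}$$

□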



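Proof: The conditional expected number of feedback needed to reach a decision on the hypothesis in SPRT can be expressed in terms of $\bar{P}^i_F$ and $\bar{P}^i_D$ [11], [12]:

$$E[N \mid H_0] = -\frac{D^i_M}{D_{KL}(p_0 \| p_1)}, \qquad E[N \mid H_1] = \frac{D^i_F}{D_{KL}(p_1 \| p_0)}.$$

Hence, to reach a decision we need to have at least $\max\{E[N \mid H_0], E[N \mid H_1]\}$ independent acquaintances. Under the assumption that both $\bar{P}^i_F$ and $\bar{P}^i_M = 1 - \bar{P}^i_D$ are much less than 1, we can further approximate

$$E[N \mid H_0] \approx -\frac{\ln \bar{P}^i_M}{D_{KL}(p_0 \| p_1)}, \qquad E[N \mid H_1] \approx -\frac{\ln \bar{P}^i_F}{D_{KL}(p_1 \| p_0)}.$$

These lead us to inequalities (21) and (22). ■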
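To make the bound concrete, the threshold approximation and the expected-sample-size bound can be evaluated numerically as in the following Python sketch. It assumes the standard Wald identities for the SPRT, with the thresholds chosen from the target rates as above; the function names are hypothetical, and `theta0` and `theta1` denote the probabilities that a single peer answers 0 under $H_0$ and $H_1$, respectively.

```python
from math import ceil, log


def wald_thresholds(pd_target, pf_target):
    """Approximate thresholds A = (1 - P_D) / (1 - P_F) and B = P_D / P_F."""
    return (1.0 - pd_target) / (1.0 - pf_target), pd_target / pf_target


def kl_bernoulli(theta_a, theta_b):
    """KL divergence between two Bernoulli distributions, as in (20)."""
    return (theta_a * log(theta_a / theta_b)
            + (1.0 - theta_a) * log((1.0 - theta_a) / (1.0 - theta_b)))


def min_acquaintances(pd_target, pf_target, theta0, theta1):
    """Lower bound on the average number of acquaintances to consult."""
    lower, upper = wald_thresholds(pd_target, pf_target)
    # Expected log-likelihood ratio at stopping under H0 (negative) and H1.
    drift_h0 = pf_target * log(upper) + (1.0 - pf_target) * log(lower)
    drift_h1 = pd_target * log(upper) + (1.0 - pd_target) * log(lower)
    n_h0 = -drift_h0 / kl_bernoulli(theta0, theta1)
    n_h1 = drift_h1 / kl_bernoulli(theta1, theta0)
    return max(ceil(n_h0), ceil(n_h1))
```

As a usage example under assumed peer statistics, `min_acquaintances(0.95, 0.1, 0.8, 0.2)` evaluates the bound for targets $\bar{P}_D = 0.95$, $\bar{P}_F = 0.1$ and peers with a raw false alarm rate and miss rate of 0.2 each.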

IV. Experiments and Results

In this section, we use simulations to evaluate the efficiency of the preceding feedback aggregation scheme and compare it with other heuristic approaches, such as the simple average aggregation and the weighted average aggregation. We validate and confirm our theoretical results on the number of acquaintances needed for consultation. The results presented in this section are produced by averaging over a large number of replications, with negligible confidence intervals. The parameters we use are shown in Table I.

A. Simulation Setup

The simulation environment uses a CIDN of $N$ nodes. Each IDS is represented by two parameters, expertise level $l$ and decision threshold $\tau_p$. At the beginning, each peer receives an initial acquaintance list containing all the other neighbor nodes. In the process of collaborative intrusion detection, a node sends out requests to its acquaintances for intrusion assessments. The feedback collected is used to make a final decision, i.e., whether to raise an alarm or not. We implement three different feedback aggregation mechanisms, namely, simple average aggregation, weighted average aggregation, and hypothesis testing aggregation. We compare their efficiency by the average cost of false decisions.

TABLE I
Experimental parameters

Parameter                   Value   Meaning
$\tau_{SA}$                 0.5     decision threshold of the simple average model
$\tau_{WA}$                 0.5     decision threshold of the weighted average model
$d$                         0.5     difficulty levels of intrusions and test messages
$\lambda_F, \lambda_D$      0.9     discount factors in (15) and (16)
$\pi_0, \pi_1$              0.5     probability of no-attack and under-attack
$C_{00}, C_{11}$            0       cost of correct decisions

1) Simple Average Model: If the average of all feedback exceeds a threshold $\tau_{SA}$, then an alarm is raised. $\tau_{SA}$ is set to 0.5 if no cost difference is considered between FP and FN decisions. The simple average mechanism for aggregating feedback is adopted in the literature, e.g., in [1].

2) Weighted Average Model: Weights are assigned to feedback from different IDSs to calculate a weighted average. The weighted average is widely used to aggregate feedback, e.g., in [2] and [3], where the weights are the trust values of IDSs and the trust values are calculated from their past history. If the weighted average is greater than a threshold $\tau_{WA}$, then an alarm is raised. $\tau_{WA}$ is fixed to 0.5 in our experiments because these models do not consider the cost difference between FP and FN. In this simulation, we adopt the trust values from [3] as the weights of feedback.
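The two baseline aggregators can be sketched in a few lines of Python. This is only an illustration of the decision rules described above: the function names are hypothetical, and the weights are passed in as plain numbers rather than computed with the trust model of [3].

```python
def simple_average_alarm(feedback, threshold=0.5):
    """Raise an alarm if the mean of the binary feedback exceeds the threshold."""
    return sum(feedback) / len(feedback) > threshold


def weighted_average_alarm(feedback, weights, threshold=0.5):
    """Raise an alarm if the weighted mean of the feedback exceeds the threshold.

    weights are assumed to be non-negative trust values, one per peer,
    with a positive sum.
    """
    weighted_sum = sum(w * y for w, y in zip(weights, feedback))
    return weighted_sum / sum(weights) > threshold
```

When all weights are equal, the two functions return the same decision, which is consistent with the close costs of the two models observed when all IDSs have the same expertise.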

B. Modeling of an Individual IDS 

To simulate the intrusion detection capability of each node, we use a Beta distribution for the decision model of an IDS. A Beta density function is given by

$$f(p \mid \alpha, \beta) = \frac{1}{B(\alpha, \beta)}\, p^{\alpha - 1} (1 - p)^{\beta - 1}, \tag{23}$$

$$\alpha = 1 + \frac{l(1 - d)}{d(1 - l)}\, r, \qquad \beta = 1 + \frac{l(1 - d)}{d(1 - l)}\, (1 - r),$$

where $B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1}\, dt$ and $p \in [0, 1]$ is the probability of intrusion assessed by the host IDS. $f(p \mid \alpha, \beta)$ is the probability density with which a peer with expertise level $l \in [0, 1]$ answers with a value of $p$ to an intrusion assessment of difficulty level $d \in [0, 1]$. Higher values of $d$ are associated with attacks that are difficult to detect, i.e., many peers may fail to identify them. Higher values of $l$ imply a higher probability of producing a correct intrusion assessment. $r \in \{0, 1\}$ is the expected result of detection: $r = 1$ indicates that there is an intrusion and $r = 0$ indicates that there is no intrusion.

Let $\tau_p$ be the decision threshold of $p$. If $p > \tau_p$, a peer sends feedback 1 (i.e., under-attack); otherwise, feedback 0 (i.e., no-attack) is generated.
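The peer model above can be simulated with the Python standard library as in the following sketch. The function names are hypothetical, and the sketch assumes $0 < d \le 1$ and $0 \le l < 1$ so that the ratio in the definitions of $\alpha$ and $\beta$ is finite.

```python
import random


def beta_params(expertise, difficulty, truth):
    """Beta parameters (alpha, beta) for expertise l, difficulty d, truth r."""
    ratio = expertise * (1.0 - difficulty) / (difficulty * (1.0 - expertise))
    return 1.0 + ratio * truth, 1.0 + ratio * (1 - truth)


def peer_feedback(expertise, difficulty, truth, threshold, rng):
    """Draw an assessment p from the Beta model and threshold it.

    Returns 1 (under-attack) if p exceeds the decision threshold, else 0.
    rng is a random.Random instance, passed in for reproducibility.
    """
    alpha, beta = beta_params(expertise, difficulty, truth)
    p = rng.betavariate(alpha, beta)
    return 1 if p > threshold else 0
```

For $l = d = 0.5$ and $r = 0$ the assessment follows $\mathrm{Beta}(1, 2)$, so with a decision threshold of 0.5 the probability of a false positive answer is $(1 - 0.5)^2 = 0.25$; by symmetry the same value holds for the false negative rate under $r = 1$.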



For a fixed difficulty level, the preceding model assigns higher probabilities of producing a correct intrusion diagnosis to peers with a higher level of expertise. $l = 1$ or $d = 0$ represents the extreme case in which the peer can always accurately detect the intrusion. This is reflected in the Beta distribution with $\alpha \to \infty$ or $\beta \to \infty$ (for $r = 1$ and $r = 0$, respectively).

Figure 2 shows that both the FP and FN rates decrease when the expertise level of an IDS increases. We notice that the curves of the FP rate and the FN rate overlap. This is because the IDS detection density distributions are symmetric under $r = 0$ and $r = 1$. Figure 3 shows that the FP rate decreases with the decision threshold while the FN rate increases with the decision threshold. When the decision threshold is 0, all feedback is positive (under-attack); when the decision threshold is 1, all feedback is negative (no-attack).

C. Detection Accuracy and Cost 

One of the most important metrics for evaluating a feedback aggregation scheme is the cost of incorrect decisions. In this experiment, we study the costs of the three aggregation models using a simulated network. We set $N = 10$, fix the expertise level $l$ of all nodes to 0.5, and set $C_{10} = C_{01} = 1$ in (3) for fairness of comparison, since the simple average and the weighted average models do not account for the cost difference between FP and FN. We fix the decision threshold of each IDS ($\tau_p$) to 0.1 for the first batch run and then increase it by 0.1 in each subsequent batch run until it reaches 0.9. We measure the cost of the three models. As shown in Figure 4, the cost yielded by the aggregation using hypothesis testing remains the lowest among the three under all threshold settings. The costs of the weighted average and the simple average are close to each other. This is because, in this experiment, the weights of all IDSs are the same, so the difference between the weighted average and the simple average is not substantial. We also observe that changing the threshold has a large impact on the costs of the weighted average and the simple average, while the cost of the hypothesis testing model changes only slightly with the threshold. All costs reach a minimum when the threshold is 0.5 and increase as it deviates from 0.5.

In the next experiment, the expertise levels of all nodes remain 0.5 and their decision thresholds vary from 0.1 to 0.9. We set $C_{10} = C_{01} = 1$ in the first batch run and increase $C_{01}$ by 1 in every subsequent batch run. We observe the costs under the three models. Figure 5 shows that the costs of the simple average model and the weighted average model increase linearly with $C_{01}$, while the cost of the hypothesis testing model grows the slowest among the three. This is because the hypothesis testing model has a flexible threshold to optimize its cost. The hypothesis testing model is most advantageous when the cost difference between FP and FN is large.

D. Sequential Consultation 

In this experiment, we study the number of acquaintances needed for consultation to reach a predefined goal. Suppose the TP lower bound is $\bar{P}_D = 0.95$ and the FP upper bound is $\bar{P}_F = 0.1$. We observe the change of the FP rate and the TP rate with the number of acquaintances consulted ($n$). Figure 6 shows that the FP rate decreases and the TP rate increases with $n$. Consulting higher-expertise nodes leads to a higher TP rate and a lower FP rate. In the next experiment, we implement Algorithm 1 on each node and measure the average number of acquaintances needed to reach the predefined TP lower bound and FP upper bound. Figure 7 compares the simulation results with the theoretical results (see (22)), where the former confirm the latter. In both cases, the number of consultations decreases quickly with the expertise level of the acquaintances. For example, the IDS needs to consult around 50 acquaintances of expertise 0.2, while only 3 acquaintances of expertise 0.7 are needed for the same purpose. This is partly because low-expertise nodes are more likely to give conflicting feedback and consequently increase the number of consultations. The analytical results can be useful for IDSs in choosing the size of their acquaintance lists.

V. Conclusion 

In this paper, we have presented a sequential hypothesis testing approach to feedback aggregation in a collaborative intrusion detection network. In this mechanism, an IDS sequentially consults its peers for diagnoses until it is able to make an aggregated decision that satisfies the Bayes optimal cost criterion. The decision is made using a threshold rule that leverages the likelihood ratio, approximated through the beta distribution, and thresholds derived from target rates. Our experimental results show that the proposed feedback aggregation model achieves better cost efficiency than the simple average and weighted average models from the literature. Our simulation results have also corroborated our theoretical results on the average number of acquaintances needed to reach the predefined false positive upper bound and true positive lower bound. As future work, we intend to investigate the robustness of the collaboration system against malicious insiders, especially under collusion attacks. Furthermore, we aim to extend our results to deal with the case of correlated feedback.